Machine Translation with Many Manually Labeled Discourse Connectives
Authors
Abstract
This paper presents machine translation experiments from English to Czech using a large number of manually annotated discourse connectives. The gold-standard discourse relation annotation improves translation performance in the range of 4–60% for some ambiguous English connectives and helps to find the correct syntactic constructs in Czech for less ambiguous connectives. Automatic scoring confirms the stability of the newly built discourse-aware translation systems. Error analysis and human translation evaluation point to the cases where the annotation was most helpful and where it was less so.
Similar resources
Using Sense-labeled Discourse Connectives for Statistical Machine Translation
This article shows how the automatic disambiguation of discourse connectives can improve Statistical Machine Translation (SMT) from English to French. Connectives are first disambiguated in terms of the discourse relation they signal between segments. Several classifiers trained using syntactic and semantic features reach state-of-the-art performance, with F1 scores of 0.6 to 0.8 over thirteen...
Machine Translation of Labeled Discourse Connectives
This paper shows how the disambiguation of discourse connectives can improve their automatic translation, while preserving the overall performance of statistical MT as measured by BLEU. State-of-the-art automatic classifiers for rhetorical relations are used prior to MT to label discourse connectives that signal those relations. These labels are used for MT in two ways: (1) by augmenting factor...
A Corpus-based Contrastive Analysis for Defining Minimal Semantics of Inter-sentential Dependencies for Machine Translation
Inter-sentential dependencies such as discourse connectives or pronouns have an impact on the translation of these items. These dependencies have classically been analyzed within complex theoretical frameworks, often monolingual ones, and the resulting fine-grained descriptions, although relevant to translation, are likely beyond reach of statistical machine translation systems. Instead, we pro...
The Role of Expectedness in the Implicitation and Explicitation of Discourse Relations
Translation of discourse connectives varies more in human translations than in machine translations. Building on Murray’s (1997) continuity hypothesis and Sanders’ (2005) causality-by-default hypothesis, we investigate whether expectedness influences the degree of implicitation and explicitation of discourse relations. We manually analyze how source text connectives are translated, and where con...
Multilingual Annotation and Disambiguation of Discourse Connectives for Machine Translation
Many discourse connectives can signal several types of relations between sentences. Their automatic disambiguation, i.e. the labeling of the correct sense of each occurrence, is important for discourse parsing, but could also be helpful to machine translation. We describe new approaches for improving the accuracy of manual annotation of three discourse connectives (two English, one French) by u...